19.  ABSTRACT


action is mediated at dopamine D2 receptors. This conclusion is supported by positive
experiments with the selective D2 receptor agonist, N-0437, which may be substituted for
dopamine as a reinforcer in neuronal operant conditioning. The D2 receptor reinforcement
hypothesis also is supported by a failure of the selective dopamine D1 antagonist,
SCH 23390, to block dopamine-reinforced operant conditioning. Preliminary results with
electrical stimulation as reinforcement in brain slice experiments also indirectly support
the dopamine reinforcement hypothesis. In these experiments, mild electrical stimulation
in the vicinity of dopamine terminals in the nucleus accumbens reinforced the bursting
activity of accumbens cells. Noncontingent applications of the same electric stimulus
failed to increase the rate of bursting.

We have begun to study the effects of delaying the presentation of reinforcement in
neuronal operant conditioning. Preliminary results suggest that zero delay is optimal
and that a delay as short as 0.5 sec largely eliminates the effectiveness of the
reinforcing stimulus. A steep gradient of delayed primary reinforcement also was obtained
in behavioral operant conditioning (brain self-stimulation test).

Finally,  we  have  begun  to  consider  the  biochemical  events  that  may  mediate  the  cellular 
reinforcement  process.  Modification  of  membrane  proteins  that  control  cellular  firing 
rates is envisioned to occur only in recently-active cells primed by the influx of Ca++
via  a  biochemical  cascade  triggered  by  reinforcing  transmitters  or  drugs. 
Introduction

This research program is based on the assumption that human
problem-solving behavior has evolved from the goal-seeking brain functions of
lower  forms.  These  functions  in  turn  depend  on  a  capacity  for  behavior  to  be 
strengthened  or  positively  reinforced  by  its  consequences,  a  process  Skinner 
(1938)  terms  operant  conditioning.  A  critical  problem  is  to  identify  the 
functional  brain  unit  whose  activity  is  modified  by  the  reinforcement  process. 
Our  early  work  suggests  that  the  individual  brain  cell  may  serve  as  such  a 
functional  unit,  leading  us  to  identify  the  "reinforced"  neuron  rather  than  the 
neuronal  network  as  the  unit  of  goal-seeking  behavior.  If  these  assumptions 
are  correct,  it  follows  that  the  fundamental  mechanisms  of  adaptation 
underlying  human  intelligence  reside  at  least  in  part  at  the  level  of  individual 
cells.  Elucidation  of  the  cellular  mechanisms  of  operant  conditioning  may  have 
important  implications  for  adaptive  network  research. 

Specific  objectives  of  this  research  included:  1)  demonstration  that  the 
activity  of  individual  neurons  in  fact  is  susceptible  to  operant  conditioning, 
2)  determination  of  the  properties  and  limits  of  such  neuronal  operant 
conditioning,  3)  investigation  of  the  biochemical  events  that  may  mediate  the 
cellular  reinforcement  process,  and  4)  comparison  of  the  properties  of  neuronal 
and behavioral operant conditioning in order to determine important similarities
and differences.

Methods 


Brain-Slice  Preparation 


Rats were decapitated and their brains rapidly removed (60-90 sec) and
chilled to 6°C in oxygenated artificial cerebrospinal fluid (ACSF; Dingledine et
al., 1980). Using plastic tools, the hippocampal region was rapidly dissected
and rinsed repeatedly with cold ACSF to minimize cell damage. The
hippocampus was positioned on a McIlwain chopper at an angle that provided
parasagittal sections (15-30°) and six 400-µm slices were obtained (Tyler, 1980).

The slices were individually transferred to ice-cold ACSF using a soft brush
and carefully placed on the nylon mesh surface in a static chamber using an
eye dropper. The slices were supported at the surface of ACSF solution in an
oxygenated atmosphere (95/5 O2/CO2, 500 ml/min) at 35°C. At least 1 hour of
incubation was allowed for recovery of physiological activity prior to the start
of experiments (Schwartzkroin, 1981). Fresh ACSF was infused into the static
chamber every 30-45 minutes or at the end of each experiment.


Extracellular Recording and Pressure Microinjection

Single-barrel micropipette blanks (Omega Dot) were pulled and back-filled
with test solution or vehicle (165 mM saline). The micropipette was connected
to a pressure injector, and the tip broken back under microscopic control to
produce a droplet approximately 18 µm in diameter at an injector setting of
15 p.s.i. and 35 ms. Using a micropositioner, the micropipette was visually
guided to targeted cells and slowly lowered until a suitable action potential
was obtained. Unit activity was displayed on a digital storage oscilloscope and
monitored on a loudspeaker. These displays were monitored for similarity of
amplitude and waveform throughout the experiment to ensure that action
potentials from the same cell, and only from that cell, were counted.
Important criteria for the selection of suitable cells included a signal-to-noise
ratio of at least 4:1 and relatively stable levels of baseline activity. Action
potentials were led into an amplitude analyzer, the output of which provided
digitized input to the computer. A minicomputer was programmed to count
unit activity, activate the injection pump, store data on-line and analyze data
off-line. A 7-channel FM recorder provided a permanent record of all
essential experimental events in sequence for later analysis.

A high-pressure microinjection system was used for rapid extracellular
delivery of picoliter volumes of neurotransmitters and drugs. Pressure
injection is required for immediate delivery of reinforcing solutions with
injection durations as short as 5 ms. High-pressure Nylaflow tubing was used
to connect the injection pump to the micropipette.

Single-Unit  Operant  Conditioning  Procedures 

The experimental protocol is diagrammed in Figure 1. A somewhat
arbitrary decision was made in choosing which aspect of unit activity to
reinforce. Since firing rates are likely to be an important vehicle for
information transmission, peak rates should have high information value and
might be amenable to conditioning. Thus, in initial experiments, a half-second
period of relatively rapid activity was defined as the neuronal response to be
reinforced (Fig. 2). These neuronal responses or "bursts" were individually
determined for each unit studied. Prior to the start of conditioning, 500
successive half-second samples of neuronal activity were recorded and a
frequency distribution of the number of spikes per sample was compiled. A
"burst" was defined as that spike number equalled or exceeded in only 2-6
percent of the samples. During operant conditioning, reinforcements were
delivered at the end of the half-second time sample containing such bursts of
firing. To minimize injection artifacts, neuronal activity during and for 3 sec
after each injection was excluded from analysis and had no consequences.
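
The criterion described above can be sketched in Python. This is only an illustration under stated assumptions, not the minicomputer program used in the experiments: the function name `burst_criterion`, the rule of taking the smallest spike count whose exceedance fraction stays at or below an upper bound of 6 percent, and the baseline data are all hypothetical. The 2 percent lower bound mentioned in the text is not enforced here.

```python
from collections import Counter


def burst_criterion(spike_counts, max_fraction=0.06):
    """Pick a spike-count criterion from baseline half-second samples.

    Returns (criterion, fraction): the smallest count k such that the
    fraction of samples containing k or more spikes does not exceed
    max_fraction, together with that fraction.
    """
    if not spike_counts:
        raise ValueError("need at least one baseline sample")
    n_samples = len(spike_counts)
    hist = Counter(spike_counts)          # spikes per sample -> number of samples
    criterion = max(hist) + 1             # fallback: a count never seen on baseline
    fraction = 0.0
    at_or_above = 0                       # samples with k or more spikes
    for k in range(max(hist), -1, -1):    # walk down from the largest count
        at_or_above += hist[k]            # Counter gives 0 for unseen counts
        if at_or_above / n_samples > max_fraction:
            break
        criterion, fraction = k, at_or_above / n_samples
    return criterion, fraction


# Hypothetical baseline: 500 half-second samples (made-up counts).
baseline = [0] * 300 + [1] * 120 + [2] * 55 + [3] * 15 + [4] * 7 + [5] * 3
criterion, fraction = burst_criterion(baseline)
print(criterion, fraction)
```

With a criterion chosen this way, any half-second sample containing at least `criterion` spikes would count as a reinforceable "burst" for that unit.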

In later experiments, the computer program was modified to permit
explicit detection of bursts of firing. In the modified program, a burst is
defined as a train of firing containing n or more spikes with a maximum
interspike interval of t ms: an example is shown in Figure 3 where n = 5 and
t = 10 ms. Again, parameters were set for individual brain cells so that, on
baseline, bursts occurred at a rate of approximately 2-6 per min. Because the
new program detects the occurrence of bursts, reinforcements could be
programmed to coincide precisely with the termination of bursts or to follow
bursts after specified delays.
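
The explicit burst definition (n or more spikes with a maximum interspike interval of t ms) can be illustrated with a short off-line sketch. The function name `detect_bursts`, the assumption that spike times arrive sorted in milliseconds, and the example spike train are hypothetical; the original program ran on-line and is not reproduced here.

```python
def detect_bursts(spike_times_ms, n=5, max_isi_ms=10.0):
    """Return (start, end) times of trains with at least n spikes whose
    consecutive interspike intervals are all <= max_isi_ms.

    spike_times_ms is assumed to be sorted in ascending order.
    """
    bursts = []
    if not spike_times_ms:
        return bursts
    start = prev = spike_times_ms[0]
    count = 1                                 # spikes in the current train
    for t in spike_times_ms[1:]:
        if t - prev <= max_isi_ms:
            count += 1                        # train continues
        else:
            if count >= n:                    # train ended; keep it if long enough
                bursts.append((start, prev))
            start, count = t, 1               # begin a new train
        prev = t
    if count >= n:                            # close the final train
        bursts.append((start, prev))
    return bursts


# Made-up spike times (ms): two qualifying trains and one short train.
spikes = [0, 8, 15, 22, 30, 100, 105, 110, 300, 309, 318, 327, 336, 345]
print(detect_bursts(spikes))
```

The end time of each detected train is the point at which a reinforcement could be scheduled, either immediately or after a specified delay.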
Figure 1. Protocol for operant conditioning of individual brain cells.
A  burst  of  firing  of  a  hippocampal  pyramidal  cell  in  area  CA1 
activates  a  pressure  injection  pump  which  puffs  a  microinjection  of 
dopamine  or  cocaine  in  the  close  vicinity  of  the  cell  soma. 


The neuronal operant-conditioning method involved six stages:
1) Baseline. The number of "bursts" in the absence of reinforcement (operant
level) was determined during a baseline period of approximately 10 minutes.
2) Operant Conditioning. Each "burst" was now followed by an injection of
the reinforcing solution. If conditioning failed to occur after 5 minutes, the
duration of the injection (and hence the dose) was increased until evidence of
conditioning was obtained, or until direct pharmacological or mechanical effects
interfered with recording.
3) Extinction. Reinforcement was terminated, and
recording continued until the baseline was recovered.
4) Matched "Free" Injections. Noncontingent injections of the reinforcing
solution were made at regular intervals to determine direct pharmacological
effects on rates of firing and probability of "bursts." The pattern and number
of "free" injections were matched to the pattern and number of reinforcing
injections in the preceding phase of operant conditioning. The presentation of
programmed free injections was delayed for 500 ms after the occurrence of
"bursts" to minimize their adventitious reinforcement. (Thus, to some extent,
the control involved counterconditioning rather than random presentation of
reinforcement.)
5) Washout. A second baseline period without injections was given in order to
allow residual effects of the noncontingent drug administrations to be
dissipated.
6) Reacquisition. A second period of reinforcement was scheduled,
whenever possible, in order to compare rates of original acquisition and
reacquisition.
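
The matched "free" injection control in stage 4 can be sketched as a scheduling rule. This is one possible reading, labeled as an assumption: injections are spaced evenly over the control period, and any injection that would fall within 500 ms after a burst is postponed until 500 ms have elapsed. The function name `free_injection_times` and the example values are hypothetical, and the sketch works off-line from a known list of burst times rather than in real time.

```python
def free_injection_times(n_injections, period_s, burst_times_s, guard_s=0.5):
    """Evenly spaced noncontingent injection times (s), each postponed so
    that it does not fall within guard_s after a burst."""
    interval = period_s / n_injections
    bursts = sorted(burst_times_s)
    times = []
    for i in range(n_injections):
        t = (i + 1) * interval                # nominal, regularly spaced time
        for b in bursts:                      # ascending, so chained delays work
            if b <= t < b + guard_s:
                t = b + guard_s               # postpone past the guard window
        times.append(t)
    return times


# Hypothetical example: 3 injections in 30 s, bursts at 9.8 s and 19.0 s.
print(free_injection_times(3, 30.0, [9.8, 19.0]))
```

Because injections are pushed away from bursts, the control is biased against accidental burst-injection pairings, which is the counterconditioning aspect noted in the text.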
Figure 2. Diagram of procedure for defining and reinforcing
neuronal responses or "bursts". Spike activity is counted and
summed arbitrarily in bins of 0.5-sec duration. Prior to experiment
proper, baseline recordings are made for each neuron under
investigation to determine a suitable response for later
reinforcement. Bins that contain n or more spikes are followed by
reinforcement, where n is that number of spikes in a bin that is
equalled or exceeded in 2-6 percent of all bins sampled. At the
bottom, a free injection is programmed after the 10th bin, and is
delivered if, as shown, the bin does not contain a burst. • = Burst.


Brain Self-Stimulation Methods

The brain self-stimulation methods have been reported previously (Black
et al., 1985). Briefly, animals were implanted with bipolar electrodes and tested
for brain stimulation reinforcement in a 28 x 25 x 30 cm high chamber with a
lever in the rear wall. Each response delivered a 150-ms train of 0.2-ms
monophasic rectangular pulses at a frequency of 100 Hz and current intensities
of 75-400 µA. For initial drug testing, current intensities were individually
adjusted to the lowest value that maintained stable rates of self-stimulation.
Stimulus delivery and response recording (cumulative records and numerical
print-outs) were under computer control.
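
As a small worked example of the stimulus parameters, the timing of one reinforcing train can be computed from the stated values (150-ms train, 0.2-ms pulses, 100 Hz). The sketch assumes that the first pulse starts at train onset and that the train holds a whole number of pulse periods; the function name `pulse_onsets_ms` is hypothetical.

```python
def pulse_onsets_ms(train_ms=150.0, freq_hz=100.0):
    """Onset times (ms) of the pulses in one stimulation train."""
    period_ms = 1000.0 / freq_hz                  # 10 ms between pulse onsets at 100 Hz
    n_pulses = int(round(train_ms / period_ms))   # whole periods in the train
    return [i * period_ms for i in range(n_pulses)]


onsets = pulse_onsets_ms()
duty_cycle = 0.2 / (1000.0 / 100.0)               # pulse width / pulse period
print(len(onsets), onsets[0], onsets[-1], duty_cycle)
```

Under these assumptions each lever press delivers 15 pulses, and current flows for about 2 percent of the train duration.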

The  effects  of  drugs  on  the  rewarding  properties  of  brain  stimulation  also 
were  studied  in  a  self-stimulation  test  using  nose-poke  as  the  operant 
response.  This  test  has  been  shown  to  be  less  sensitive  to  motor  debilitating 
effects of drugs than tests using the lever-press response and thus provided a
control  for  nonspecific  side  effects.  Further  analysis  included  measures  of 
latency  to  respond  and  identification  of  extinction-like  suppression  patterns 
that  indicate  a  receptor-mediated  reward  decrement  process. 
Figure 3. Burst of neuronal activity recorded from a hippocampal
CA1  cell  (upper  trace).  This  pattern  of  firing  was  arbitrarily 
defined  as  a  reinforceable  response  or  "burst"  and  consists,  for  this 
unit,  of  a  train  of  5  or  more  spikes  with  a  maximum  interspike 
interval  of  10  ms.  Lower  trace  shows  1-ms  rectangular  pulses  which 
mark  each  spike  that  is  detected  by  an  amplitude  discriminator. 

Conditioned  Place-Preference 

Animals  were  tested  in  an  apparatus  which  consisted  of  two  large 
chambers,  one  black  plexiglas  and  the  other  white  plexiglas,  separated  by  a 
small  central  "neutral"  area  which  was  gray.  The  black  compartment  had  a 
grid  floor,  wood  shavings  under  the  floor,  and  soap  solution  applied  to  the 
walls; the white compartment had mesh flooring, corncob litter under the
floor,  and  ethanol  solution  applied  to  the  walls.  Time  spent  in  each  chamber 
was  detected  by  microswitches  under  each  floor  that  were  connected  to  a 
computer. 


The conditioned place-preference procedure consisted of three phases.
Preconditioning (Days 1-3): each rat was allowed to investigate the apparatus
for 15 min per day for 3 consecutive days. The time spent in each of the
large compartments on the third day was used to determine the initial
unconditioned preference for the two sides. Conditioning (Days 4-11): each rat
received 4 daily injections of the drug treatment, administered every other
day. Following drug administration, the rat was confined in the less preferred
environment for 30 min. Alternating with these treatments, each rat also
received 4 presentations of vehicle on intervening days, and these were paired
with the initially more preferred side of the apparatus. Test (Day 12): no
injections were administered and each rat was placed in the central area of
the apparatus and the time spent in each large compartment was recorded for
15 min. The extent of place conditioning was determined by comparing time
spent in the less preferred compartment on Day 3 with time spent in the same
compartment on Day 12.
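
The place-conditioning measure described above amounts to a simple difference score. The following is a minimal sketch with hypothetical function names and made-up times (in seconds, out of a 900-sec session); it assumes the score is the Day 12 time minus the Day 3 time in the initially less preferred compartment.

```python
def less_preferred_side(day3_times_s):
    """Compartment in which the rat spent the least time on Day 3."""
    return min(day3_times_s, key=day3_times_s.get)


def place_conditioning_score(day3_times_s, day12_times_s):
    """Return (side, change): the initially less preferred compartment and
    the change in time (s) spent there from Day 3 to Day 12."""
    side = less_preferred_side(day3_times_s)
    return side, day12_times_s[side] - day3_times_s[side]


# Hypothetical rat: times in seconds for the two large compartments.
day3 = {"black": 520, "white": 300}
day12 = {"black": 380, "white": 450}
print(place_conditioning_score(day3, day12))
```

A positive change indicates that the rat spent more time on the drug-paired side after conditioning.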


Results 

Evidence of Neuronal Operant Conditioning (#13, #14)*

Results  from  a  representative  positive  experiment  using  dopamine  as  the 
reinforcing  solution  are  shown  for  a  hippocampal  unit  in  Figure  4.  In  two 
separate  periods  of  operant  conditioning  (REINF),  the  frequency  of  "bursts" 
and  the  overall  firing  rate  were  rapidly  increased  after  approximately  5 
dopamine  reinforcements.  The  same  dopamine  injections  administered 
noncontingently (MATCH) failed to increase either "burst" frequency or overall
firing  rate.  Because  neuronal  activity  was  not  increased  by  these 
noncontingent  administrations,  we  can  rule  out  the  possibility  that  direct 
stimulant  effects  of  dopamine  caused  the  increases  in  neuronal  activity  that 
were  observed  in  the  reinforcement  periods.  Accordingly,  we  tentatively 
attribute these reinforcement-induced increases to a neuronal process akin to
operant  conditioning.  Note  that  the  firing  rate  turned  down  at  the  end  of 
both  reinforcement  periods.  This  effect  typically  is  observed  if  high  rates  of 
bursting  have  been  generated  by  the  reinforcement  procedure,  and  we 
tentatively  attribute  it  to  a  direct  inhibitory  effect  of  dopamine  when  the 
reinforcement  density  (and  therefore  the  local  dopamine  concentration)  is 
excessive. In an effort to protect the unit from excessive dopamine
concentrations,  we  typically  terminate  the  reinforcement  period  at  the  point 
that  the  acquisition  curve  turns  down.  In  the  experiment  shown  in  Figure  4, 
rates  of  bursting  and  overall  firing  continued  to  decline  sharply  after 
reinforcement  had  been  terminated,  suggesting  rapid  extinction  of  neuronal 
operant conditioning. Other units, however, sometimes respond for protracted
periods  in  extinction  (e.g.,  see  Fig.  7). 

The  data  in  the  curves  shown  in  the  lower  half  of  Figure  4  are  replotted 
as  cumulative  records  of  bursting  in  Figure  5.  These  replots  are  intended  to 
facilitate  comparison  with  behavioral  operant  conditioning  data  (which  are 
conventionally  displayed  as  cumulative  response  curves).  The  neuronal  data  are 
now  seen  to  closely  resemble  behavioral  acquisition  curves  (Skinner,  1938). 
Two  additional  features  of  the  neuronal  data  also  are  evident  in  the  replots. 
First, the slope of the cumulative response curve in the second reinforcement
period  is  somewhat  sharper  than  that  in  the  first  period,  suggesting  a  neuronal 
equivalent  of  enhanced  reacquisition  or  "savings".  Secondly,  the  response  rate 
in  the  second  extinction  period  substantially  exceeded  that  in  the  first,  again 
suggesting some persistent effect of reinforcement. Both of these features are
typical  of  behavioral  operant  conditioning  (see  Fig.  6). 
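
A cumulative record of the kind used for these replots is a running sum of responses, so that the slope at any point reflects the response rate. The sketch below uses made-up burst counts per block and a hypothetical function name; it is not derived from the data in Figure 4.

```python
from itertools import accumulate


def cumulative_record(bursts_per_block):
    """Running total of bursts across successive blocks."""
    return list(accumulate(bursts_per_block))


# Made-up counts: low baseline, rising during reinforcement, falling in extinction.
blocks = [2, 3, 10, 18, 20, 6, 3]
print(cumulative_record(blocks))
```

Steep segments of the resulting curve correspond to blocks with high burst rates, and flat segments to blocks with few bursts.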


*Numbers refer to project publications listed on pages 31-32.
Figure 4. Operant conditioning of the activity of a CA1 pyramidal
cell in a slice of dorsal hippocampus using local injections of
dopamine as reinforcement. The activity of the unit throughout
seven phases of a complete experiment is shown. Each point shows
the number of "bursts" (lower graph) and the total number of spikes
(upper graph) in successive blocks of 100 half-second samples or
trials. Prior to the first baseline phase, a "burst" criterion of 4 or
more spikes per half-second sample was selected. This criterion
gave a "burst" rate for this unit that never exceeded 4 percent in
the initial baseline period (BASE). In the reinforcement period
(REINF), dopamine HCl (1 mM in 165 mM saline) was applied for
5 ms immediately after each "burst". Following a second baseline
period, the same dopamine injections were delivered (MATCH)
independently of the unit's behavior as a control for possible
stimulant effects. The number of injections was matched to that
earned during the last four periods of the reinforcement phase.
"Burst" and overall spike rates were increased by the contingent
dopamine injections during the reinforcement periods, but were not
increased when the same injections were administered
noncontingently in the matched-injection period. Inset: (upper
trace) photograph of oscilloscope display of two action potentials
from the unit undergoing conditioning, and (lower trace) 1-ms time
markers.
Results from a positive experiment with cocaine as reinforcement are
shown  in  Figure  7.  Initially,  free  injections  of  cocaine  delivered  at  a  rate  of 
approximately  5  per  minute  had  no  effect  on  the  frequency  of  "bursts"  or  on 
the  overall  firing  rate.  In  the  first  reinforcement  period,  after  approximately 
10 applications of cocaine, the frequency of "bursts" and the overall firing rate
were  sharply  increased;  again,  both  curves  turned  down  at  the  end  of  the 
period,  presumably  because  of  an  excessive  local  cocaine  concentration.  Unlike 
the  experiment  shown  in  Fig.  4,  neuronal  firing  rates  in  the  baseline  period 
that  followed  the  first  phase  of  reinforcement  did  not  extinguish  rapidly; 
indeed,  the  peak  firing  rates  achieved  in  the  reinforcement  phase  were 
sustained  for  several  minutes  after  the  onset  of  extinction.  Free  cocaine 
injections  ("MATCH")  then  were  delivered  at  a  rate  of  approximately  12  per 
minute  to  match  the  peak  rate  obtained  in  the  preceding  reinforcement  period. 
These  densely-packed  free  injections  had  no  effect  on  the  number  of  "bursts" 
or  on  the  overall  firing  rate.  In  a  second  reinforcement  period,  contingent 
injections of cocaine again increased the frequency of "bursts" and the overall
firing  rate,  but  not  to  the  level  observed  in  the  first  reinforcement  period. 


Figure 5. "Burst" data shown in lower half of Fig. 4 replotted as
cumulative curves of bursting.


In control experiments, either saline was substituted for dopamine (Fig. 8)
or dopamine was administered noncontingently throughout the experiment (Fig.
9). In these experiments, neither "bursts" nor overall firing rates were
increased. A summary of 8 positive dopamine experiments in which it was
possible to complete two reinforcement periods, as exemplified in the
experiment shown in Figure 4, is shown in Figure 10. Plotted here for 8
different neurons are the mean peak rates obtained at each stage of the
experiment. Significant increases were obtained in each of the reinforcement
periods  when  compared  either  to  baseline  control  periods  or  to  periods  in 
which  the  same  dopamine  injections  were  presented  independently  of  neuronal 
bursting.  A  similar  summary  of  11  positive  cocaine  experiments  is  shown  in 
Figure  11. 


Figure 6. Results of a behavioral self-stimulation experiment which
was designed to replicate the neuronal operant conditioning
experiments. A nose-poke response was substituted for the burst of
firing and electrical brain stimulation reinforcement was substituted
for reinforcing drug injections. Experimentally naive rats,
previously implanted with medial forebrain bundle electrodes, were
placed in a Skinner box and trained under the same alternating
contingencies used in the neuronal experiments: REINF = each
nose-poke response is reinforced with a 0.15-sec train of brain
stimulation; BASE = each nose-poke is recorded but has no other
programmed contingencies; FREE = brain stimulations are delivered
noncontingently and matched in rate to that observed in the prior
reinforcement period. Note that response rate is sharply increased
by reinforcement, that it declines rapidly during extinction, and that
noncontingent administrations of brain stimulation do not increase
nose-poking above the baseline level. Note further that response
rates in the second reinforcement period exceed those in the first.


These positive results with dopamine and cocaine contrast with the
negative findings of experiments in which a variety of other transmitters and
drugs were surveyed (Table 1). In the columns labelled "RESULTS", the
designations are as follows: ++ = evidence of operant conditioning (increased
bursting in reinforcement periods and no such increase in noncontingent
Figure 7. Operant conditioning of a pyramidal neuron in a dorsal
hippocampal  slice  using  local  injections  of  cocaine  as  reinforcement. 
For  details,  see  text  and  Fig.  4.  FREE  =  noncontingent  injections. 
Figure 8. Saline control experiment. Failure to obtain evidence of
operant conditioning of a pyramidal neuron in dorsal hippocampal
slice  with  local  injections  of  saline  as  reinforcement.  For  details, 
see  text  and  Fig.  4. 


L.  Stein  &  J.D.  Belluzzi 


Progress  Report 
AFOSR  Grant  #84-0325 



Figure 9. Control experiment with dopamine administered
noncontingently to a pyramidal neuron in hippocampal slice. For
details, see text and Fig. 4.


Figure 10. Summary of positive dopamine experiments. Bars show
peak rates of bursting obtained in each phase of the neuronal
conditioning experiment, as exemplified in Figure 4. N = 8, vertical
lines represent S.E.M.s. *p < 0.05.



Figure 11. Summary of positive cocaine experiments. N = 11,
vertical lines represent S.E.M.s. *p < 0.05. For further explanation
see Figures 4 and 10.


Table 1. Summary of hippocampal brain-slice experiments.

                                 No. of         RESULTS*
Drug               Dose (mM)     Exps.      ++     +     -

Cocaine                1           48       11    12    25
Cocaine (Free)         1           13        0     0    13
Dopamine               1           17        9     2     6
Dopamine (Free)        1           12        0     1    11
Norepinephrine         1            4        1     1     2
Acetylcholine                       6        1     1     4
Serotonin              1            3        0     0     3
GABA                   1            4        0     0     4
Amphetamine            1            3        0     2     1
Imipramine                                   0     0     2
Ethanol                1            3        0     0     3
Saline               165            5        0     0     5

*Columns are defined as follows: ++ = conditioning-like changes (increased
probability of bursts following reinforcement) plus noncontingent controls,
+ = conditioning-like changes, but no controls, - = no evidence of conditioning.
(Free) = noncontingent injections.
control periods), + = conditioning-like increases but no noncontingent controls,
and - = no evidence of conditioning. The table thus indicates that 9 of the 17
dopamine  experiments  (or  slightly  more  than  50%)  were  positive  and  contained 
noncontingent  controls.  In  the  cocaine  experiments,  a  similar  percentage  of 
neurons  exhibited  increased  bursting  in  reinforcement  periods,  but  it  was  more 
difficult  to  obtain  adequate  noncontingent  controls  in  the  same  experiments. 
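
The proportions referred to here can be checked directly from the counts in Table 1. The sketch below copies the cocaine and dopamine rows of the table; the dictionary layout and the function name `percent_positive` are illustrative assumptions.

```python
# Rows copied from Table 1: (No. of Exps., "++", "+", "-").
TABLE_1 = {
    "Cocaine": (48, 11, 12, 25),
    "Dopamine": (17, 9, 2, 6),
}


def percent_positive(drug, include_uncontrolled=False):
    """Percent of experiments scored "++" (optionally "++" or "+")."""
    n_exps, controlled, uncontrolled, _negative = TABLE_1[drug]
    hits = controlled + (uncontrolled if include_uncontrolled else 0)
    return round(100.0 * hits / n_exps, 1)


for drug in TABLE_1:
    print(drug, percent_positive(drug), percent_positive(drug, True))
```

For dopamine, 9 of 17 experiments is about 52.9 percent. For cocaine, 23 of 48 experiments (about 47.9 percent) showed conditioning-like increases, but only 11 of 48 (about 22.9 percent) also included noncontingent controls.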


Evidence  of  Dopamine  Receptor  Specificity  ( #i.  #5,  #12) 

Dopamine  receptor  antagonists  were  studied  in  neuronal  operant 
conditioning  experiments  in  an  attempt  to  determine  whether  dopamine's 
reinforcing  action  is  specifically  exerted  at  a  dopamine  receptor  or  is  due  to 
some  nonspecific  action  of  dopamine.  In  initial  experiments  (#3),  the  mixed 
dopamine  D1  and  D2  receptor  antagonist  chlorpromazine  completely  blocked 
dopamine's  reinforcing  action  in  neuronal  operant  conditioning  (Fig.  12).  In 
these  experiments,  hippocampal  units  reinforced  with  dopamine  (DA-REINF) 
again exhibited significantly higher bursting rates than control neurons
reinforced  with  saline  (SAL-REINF).  When  chlorpromazine  was  added  to  the 
dopamine  solution  (DA  +  CPZ),  the  reinforcing  action  of  dopamine  was 
abolished;  indeed,  the  dopamine-chlorpromazine  mixture  apparently  suppressed 
the rate of bursting below the saline control and below those neurons that had
received chlorpromazine alone (CPZ) as reinforcement.

The availability of new drugs with greater selectivity than chlorpromazine
has enabled us to distinguish between effects exerted at dopamine D1 and D2
receptors (Fig. 13). When the selective D2 antagonist, sulpiride, was added to
dopamine (DA + SUL), the reinforcing action of dopamine was abolished and
the rate of bursts was suppressed to the saline control level. On the other
hand, when the dopamine D1 receptor antagonist, SCH 23390, was mixed with
dopamine (DA + SCH), the reinforcing action of dopamine was unaffected or
possibly even slightly increased. These results suggest that dopamine's
reinforcing effects are exerted at dopamine D2 receptors. This conclusion is
supported by positive experiments with the D2 receptor agonist, N-0437, which
may be substituted for dopamine as an effective reinforcer in neuronal operant
conditioning (Fig. 14). Although higher concentrations of N-0437 than
dopamine were required for neuronal operant conditioning, it is our impression
that at these higher concentrations N-0437 is a more reliable reinforcing agent.

Preliminary results with electrical stimulation as reinforcement in brain
slice experiments also indirectly support the dopamine reinforcement
hypothesis. In these experiments, mild electrical stimulation delivered directly
to a localized site in the brain slice was substituted for the reinforcing
dopamine injections in a typical neuronal operant conditioning procedure. The
parameters of electrical stimulation were identical to those used in behavioral
self-stimulation studies. In nucleus accumbens brain slices, mild electrical
stimulation in the presumed vicinity of the dopamine projections reinforced the
bursting of accumbens cells (Fig. 15). Noncontingent applications of the same

Figure  12.  Chlorpromazine  blocks  operant  conditioning  of  individual 
CA1  cellular  activity  in  slices  of  hippocampus,  using  local 
applications  of  dopamine  as  reinforcement  (see  Methods  and  Fig.  4 
for  procedure).  Neurons  reinforced  with  1-mM  dopamine 
(DA-REINF)  exhibited  significantly  more  "bursts"  than  controls 
reinforced  with  saline  (SAL-REINF).  When  1-mM  chlorpromazine 
was  added  to  the  dopamine  solution  (DA  +  CPZ),  the  reinforcing 
action  of  dopamine  was  abolished  and  the  rate  of  "bursts"  was 
suppressed  below  the  saline  control.  Neurons  that  received 
chlorpromazine  alone  (CPZ)  exhibited  the  same  number  of  "bursts"  as 
those that had received saline. SAL-FREE = noncontingent saline
injections; DA-FREE = noncontingent dopamine injections.
Figure 13. Sulpiride, but not SCH 23390, blocks operant conditioning
of individual CA1 cellular activity in slices of hippocampus, using
applications of dopamine as reinforcement (see Methods and Fig. 4
for procedure). Neurons reinforced with 1-mM dopamine
(DOPAMINE)  exhibited  significantly  more  bursts  than  controls 
reinforced  with  saline  (SALINE).  When  sulpiride  (10  mM)  was  added 
to  the  dopamine  solution  (DA  +  SUL),  the  reinforcing  action  of 
dopamine  was  abolished  and  the  rate  of  bursts  was  suppressed  to  the 
saline  control  level.  On  the  other  hand,  when  1-mM  SCH23390  was 
added  to  the  dopamine  solution  (DA  +  SCH)  the  reinforcing  action  of 
dopamine was unaffected.

stimulation failed to increase the rate of bursting. In hippocampal slices,
however,  similar  electrical  stimulation  experiments  produced  no  evidence  of 
operant conditioning. In this case (Fig. 15), contingent and noncontingent
electrical  stimulation  produced  similar  and  much  smaller  changes  in  the  rates 
of  bursting.  It  is  possible  that  the  positive  results  in  nucleus  accumbens  may 
be  associated  with  the  heavy  density  of  dopamine  projections  to  this  region, 
while  the  negative  results  in  hippocampus  may  be  associated  with  its  much 
thinner  dopamine  innervation. 


CONCENTRATION (mM)

Figure  14.  Neuronal  operant  conditioning  obtained  with  N-0437 
reinforcement  as  a  function  of  drug  concentration.  The  reinforcing 
action  of  N-0437  (10  mM)  was  abolished  by  chlorpromazine  (1  mM). 


Effects  of  Delayed  Reinforcement  in  Neuronal  Operant  Conditioning  (#2) 

In behavioral operant conditioning, it is well established that the
effectiveness of the reinforcement is sharply reduced when the presentation of
the reinforcing stimulus is substantially delayed after the correct response
(Renner, 1964). The brain self-stimulation method, by eliminating the necessity
for consummatory responses, permits precise temporal control of the interval
between the operant response and primary reinforcement. Using this method,
we found that delays even as short as one second markedly impede the
acquisition of self-stimulation behavior (Fig. 16). Demonstration of a similar
delay-of-reinforcement decrement in neuronal operant conditioning experiments
would provide strong support for the hypothesis that cellular reinforcement
processes underlie behavioral reinforcement.

Figure 15. Neuronal operant conditioning experiments with electrical
stimulation as reinforcement in hippocampal and nucleus accumbens
brain slices. The electric stimulus (50-100 µA, 100 Hz, 100 ms in
duration)  was  delivered  to  a  localized  site  in  the  brain  slice  within 
approximately  1  mm  of  the  recording  micropipette,  either 
contingently  after  bursts  of  firing  (REINFORCING),  or  independently 
of  neuronal  activity  (NONCONTINGENT).  Bars  show  peak  rates  of 
bursting  obtained  with  each  procedure  as  a  percent  of  baseline.  In 
nucleus  accumbens,  very  large  increases  in  bursting  were  obtained 
with  reinforcing  stimulation;  these  increases  are  suggestive  of 
operant  conditioning  since  noncontingent  stimulation  was  ineffective. 
In  hippocampus,  on  the  other  hand,  there  was  no  evidence  of 
operant  conditioning  since  reinforcing  and  noncontingent  stimulation 
produced  equal  (and  much  smaller)  increases  in  bursting. 


Because N-0437 produces highly reliable baselines of operant conditioning,
this compound was used as the reinforcing substance in our initial work on the
delay of reinforcement problem. A representative experiment comparing the
efficacy of immediate and delayed reinforcement is shown in Figure 17.
Immediate and delayed reinforcement procedures were identical, except that the
delay procedure interposed an interval of 500 ms between the last spike in the
burst and the presentation of reinforcement (DELAYED REINF). After
causing  a  brief  increase  in  the  bursting  rate,  delayed  reinforcement  had  no 
sustained  effect  or  perhaps  even  suppressed  the  rate  of  bursting.  On  the 
other  hand,  in  a  subsequent  period  of  immediate  reinforcement  (IMMEDIATE 
REINF),  bursting  rates  increased  sharply  in  a  characteristic  acquisition  curve. 
A  similar  result  is  shown  for  a  second  unit  in  an  experiment  in  which  the 
sequence  of  immediate  and  delayed  reinforcement  was  reversed  (Fig.  18).  The 
efficacy of operant conditioning associated with reinforcement delays of 0, 100,
200, or 500 ms was determined in an experiment involving 32 units; each unit
received operant conditioning at a single reinforcement delay. A
delay-of-reinforcement gradient was generated by averaging the peak bursting
rates at each delay (Figure 19). The curve indicates that reinforcement delays
exceeding 200 ms largely eliminate the effectiveness of N-0437 reinforcement
in CA1 operant conditioning. Such a steep gradient of reinforcement delay is
consistent  with  that  obtained  in  behavioral  experiments,  and  supports  the  idea 
that  the  neuronal  operant  conditioning  process  may  underlie  the  behavioral 
operant  conditioning  process. 
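
The averaging step behind such a delay-of-reinforcement gradient can be sketched as follows. This is only an illustrative sketch, not the analysis code used in the experiments: the function name `delay_gradient`, the record layout, and all numerical values are assumptions introduced here, and the example rates are hypothetical rather than data from this report.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def delay_gradient(records):
    """Average peak bursting rate at each reinforcement delay.

    records: iterable of (delay_ms, peak_rate) pairs, one pair per unit,
    where peak_rate is that unit's peak bursting rate (e.g. percent of baseline).
    Returns {delay_ms: (mean_peak, sem, n_units)}.
    """
    by_delay = defaultdict(list)
    for delay_ms, peak_rate in records:
        by_delay[delay_ms].append(peak_rate)

    gradient = {}
    for delay_ms in sorted(by_delay):
        peaks = by_delay[delay_ms]
        # S.E.M. is undefined for a single unit, so report None in that case.
        sem = stdev(peaks) / sqrt(len(peaks)) if len(peaks) > 1 else None
        gradient[delay_ms] = (mean(peaks), sem, len(peaks))
    return gradient


# Hypothetical peak rates (percent of baseline); NOT data from the report.
example_units = [
    (0, 300.0), (0, 260.0),
    (100, 220.0), (100, 180.0),
    (200, 130.0), (200, 110.0),
    (500, 100.0), (500, 90.0),
]

for delay_ms, (avg, sem, n) in delay_gradient(example_units).items():
    print(f"{delay_ms:>3} ms: mean peak = {avg:.1f}, S.E.M. = {sem:.1f}, n = {n}")
```

Because each unit contributes to only one delay, the groups are independent and the per-delay mean and S.E.M. can be computed separately, as above.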


Possible  Role  of  Norepinephrine  in  Neuronal  Operant  Conditioning 

Because  of  the  important  role  of  norepinephrine  as  a  first  messenger  in 
the  phosphoinositide  sequence,  we  reexamined  the  efficacy  of  norepinephrine  as 
a reinforcing substance in neuronal operant conditioning. Norepinephrine’s
triggering action in the phosphoinositide sequence is exerted exclusively at
α-noradrenergic receptors; it therefore seemed logical to retest norepinephrine
in a mixture containing the β-noradrenergic receptor antagonist, propranolol,
in an attempt to produce a relatively pure α-noradrenergic receptor activation.
Initial data presented in Figure 20 in fact suggest that selective activation of
α-noradrenergic (NE + PROP) receptors may provide more effective
reinforcement than simultaneous activation of α- and β-noradrenergic receptors


Figure 16. Acquisition of operant behavior (hypothalamic
self-stimulation) as a function of reinforcement delay. Total lever-press
responses on Day 1 of training are shown for different groups of
animals reinforced after the indicated delay. Note that a delay of
only 1 sec produced a rate decrease of approximately 90%. Bars
represent ± S.E.M.

Figure  17.  A  representative  neuronal  operant  conditioning 
experiment  in  which  the  efficacy  of  immediate  and  delayed  (500  ms) 
reinforcement is compared. The delayed reinforcement procedure
(DELAYED  REINF)  produced  a  brief,  but  unsustained,  increase  in 
bursting;  on  the  other  hand,  immediate  reinforcement  (IMMEDIATE 
REINF)  produced  a  characteristic  acquisition  curve. 


Figure  18.  A  second  example  of  a  neuron  for  which  a  reinforcement 
delay  of  500  ms  (DELAYED  REINF)  eliminated  the  reinforcing  action 
of  N-0437.  Compare  with  Figure  17. 


L.  Stein  &  J.D.  Belluzzi 


Progress  Report 
AFOSR  Grant  #84-0325 


REINFORCEMENT  DELAY  (ms) 


Figure  19.  Delay  of  reinforcement  gradient  in  neuronal  operant 
conditioning  with  N-0437  (10  mM)  as  reinforcement.  Number  of 
neurons  tested  at  each  reinforcement  delay  indicated  in  parentheses. 
Vertical lines represent ± S.E.M.s.


together  (NE).  Preliminary  experiments  also  suggest  that  norepinephrine  may 
be  combined  with  otherwise  ineffective  doses  of  N-0437  to  produce  neuronal 
operant  conditioning  (Fig.  21). 


Operant Conditioning of Single Units in Whole Brain (#6)

Whole  brain  preparations  have  been  used  to  identify  target  cells,  in 
addition to hippocampal CA1 neurons, that may be suitable for operant
conditioning.  In  these  experiments,  electrical  stimulation  of  the  medial 
forebrain  bundle  (delivered  through  conventional,  permanently  implanted 
electrodes  whose  reinforcing  efficacy  had  previously  been  demonstrated  in 
behavioral  self-stimulation  tests)  provided  reinforcement  for  neuronal  operant 
conditioning.  The  rats  were  anesthetized  with  urethane  (1.2  g/kg,  I.P.),  and  an 
extracellular  recording  electrode  was  progressively  lowered  from  the  surface  of 
the  cortex  through  the  nucleus  accumbens  (a  major  target  for  the  dopamine 
fibers  in  the  MFB).  Neurons  that  exhibited  operant  conditioning  were  found 
exclusively in medial frontal cortex (Fig. 22).

Figure 20. Neuronal operant conditioning produced by combined
administration of norepinephrine (NE 0.5 mM) and the
β-noradrenergic receptor antagonist propranolol (PROP 0.5 mM).
Prolonged  elevation  of  firing  rates  after  reinforcement  is 
discontinued  is  characteristic  of  this  combination  of  drugs. 


Figure  21.  Neuronal  operant  conditioning  produced  by  combined 
administration of norepinephrine (NE 0.5 mM) and the D2 dopamine
receptor agonist N-0437 (0.5 mM). This mixture of drugs sometimes
produces direct stimulant effects on neuronal firing rates (not
shown).

Figure 22. Operant conditioning of a single frontal cortical unit in
the intact brain of an anesthetized rat using electrical stimulation
(150 msec train of 0.2 msec pulses at 100 Hz, 400 µA) of the medial
forebrain bundle through an implanted electrode as reinforcement.
In this experiment, a sharp acquisition curve was produced by
contingent presentations (REINF) of the rewarding electrical stimulus
after bursts of firing; noncontingent presentations ("FREE") of the
same stimulus were ineffective. BURST = train of 6 or more spikes
with  a  maximum  interspike  interval  of  15  ms. 
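
The burst criterion stated in the caption above (6 or more spikes with a maximum interspike interval of 15 ms) can be expressed as a short detection routine. The following is a minimal sketch under stated assumptions, not the detection circuit or software actually used: the function name `find_bursts` and its parameters are hypothetical, spike times are assumed to be sorted and given in milliseconds, and the example spike train is invented for illustration.

```python
def find_bursts(spike_times_ms, min_spikes=6, max_isi_ms=15.0):
    """Return (start_ms, end_ms) for every spike train that qualifies as a burst.

    A burst is a run of at least `min_spikes` spikes in which every
    consecutive interspike interval is <= `max_isi_ms`.
    `spike_times_ms` must be sorted in ascending order.
    """
    bursts = []
    run_start = 0  # index of the first spike in the current run
    n = len(spike_times_ms)
    for i in range(1, n + 1):
        # A run ends at the last spike, or when the next interval is too long.
        if i == n or spike_times_ms[i] - spike_times_ms[i - 1] > max_isi_ms:
            if i - run_start >= min_spikes:
                bursts.append((spike_times_ms[run_start], spike_times_ms[i - 1]))
            run_start = i
    return bursts


# Hypothetical spike train: six spikes 10 ms apart, then scattered spikes.
spikes = [0, 10, 20, 30, 40, 50, 200, 210, 400]
print(find_bursts(spikes))  # [(0, 50)]
```

In a contingent (REINF) procedure the reinforcing stimulus would be triggered at the end time of each detected burst. The thresholds are parameters because the slice experiments in this report used a different criterion (3 or more spikes).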


Hippocampal Self-Stimulation (#7)

The success of our neuronal operant conditioning experiments in
hippocampal brain slices led us to reexamine hippocampal self-stimulation at
the behavioral level. Although there are published reports that rats will
lever-press for electrical stimulation of the dentate gyrus or other hippocampal
sites, the rates of such hippocampal self-stimulation are very low (Ursin, Ursin
and Olds, 1966). In an initial experiment, we were unable to train rats to
bar-press for hippocampal self-stimulation, even after extensive shaping;
however, a nose-poke response for the hippocampal reward was rapidly
learned (Fig. 23). In a second experiment, naive rats with electrodes in the
CA1, CA3, or dentate gyrus areas of hippocampus were trained to work for
brain stimulation in the nose-poke test, then were switched to a bar-press test
for five sessions, and finally were returned to the nose-poke test. When the
rats were switched to the bar-press test, their self-stimulation rates abruptly
fell to 20% of the nose-poke rate; the depressed rates recovered immediately
when the rats were returned to the nose-poke task. In pharmacological
experiments, we found that amphetamine (1 mg/kg) dramatically increased

Figure 23. Acquisition curves of hippocampal self-stimulation for
two groups of rats reinforced either for nose-poke or bar-press
responses. Rats learned the nose-poke response spontaneously, but
could not learn to press a bar for the hippocampal stimulation, even
with extensive shaping. Bars represent ± S.E.M.s.


nose-poke  self-stimulation  rates  at  all  3  brain  sites.  Self-stimulation  rates 
were  increased  as  much  as  10-fold  in  some  cases,  strongly  implicating  a 
catecholamine  in  hippocampal  reward.  Naloxone  (2  mg/kg)  selectively 
decreased  self-stimulation  at  the  CA3  site,  suggesting  that  reinforcement 
associated with this site may be regulated by endogenous opioids.


Nucleus Accumbens Self-Stimulation: Evidence of Endorphin-Mediated
Reinforcement (#2, #5, #19, #20, #21)

The  opiate  antagonist  naloxone  suppresses  self-stimulation  of  the  nucleus 
accumbens  and  other  brain  areas  rich  in  endorphins.  In  a  series  of 
experiments, we showed that the suppressive effect of naloxone is independent of
response effort (#21), centrally mediated (#2, 20) and resembles the effects of
nonreinforcement  or  extinction  in  its  time  course  (#19).  These  results  support 
the  hypothesis  that  nucleus  accumbens  self-stimulation  depends  upon  the 
activation  of  endorphin  neurons  and  the  consequent  release  of  endogenous 
opioids  which  function  as  reward  transmitters.  If  this  hypothesis  were  correct, 
enhancement  of  endorphin  release  would  decrease  the  behavioral  efficacy  of 
naloxone  due  to  increased  competition  for  reward  receptors.  To  test  this  idea 
(#5), endorphin release was varied by systematic manipulation of the pulse
frequency  of  the  rewarding  electrical  stimulus.  Animals  with  nucleus 
accumbens electrodes were trained in one-hour daily sessions to nose-poke for

Figure 24. Naloxone suppression of nucleus accumbens
self-stimulation varies inversely with stimulation pulse frequency.
Mean self-stimulation rates in the last 45 minutes of the 1-hr test
are plotted as a function of stimulation frequency (N = 9 at each
point).  Naloxone  scores  are  expressed  as  mean  percent  of  the  saline 
control  at  the  same  pulse  frequency.  Saline  scores  are  the  mean 
percent  of  the  saline  rate  at  100  Hz. 


electrical  brain  stimulation  (150-msec  train  of  0.2  msec  monophasic  square 
pulses, 100 Hz, 375 µA). After self-stimulation rates had stabilized, baseline
pulse  frequency-response  curves  were  established  for  each  animal  in  the  range, 
25-400  Hz.  Such  pulse  frequency-response  curves  were  then  established 
following  injections  of  naloxone  (2  mg/kg,  s.c.)  and  saline  (1  ml/kg,  s.c.).  The 
open  circles  in  Figure  24  show  saline  self-stimulation  rates  at  each  pulse 
frequency  as  a  percent  of  the  saline  rate  at  100  Hz  (the  standard  pulse 
frequency used throughout training). Black squares represent naloxone
self-stimulation scores as a percent of saline self-stimulation scores at each of the
indicated  pulse  frequencies.  Consistent  with  the  endorphin  reward  hypothesis, 
naloxone  suppression  of  self-stimulation  decreased  substantially  with  increasing 
pulse  frequency.  These  results  are  consistent  with  the  idea  that  nucleus 
accumbens  self-stimulation  depends  on  the  activation  of  endorphin  reward 
receptors  and  that  naloxone’s  suppressant  action  is  associated  with  the 
blockade  of  these  receptors. 
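
The two normalizations used for Figure 24 (saline rates as a percent of the saline rate at 100 Hz, and naloxone rates as a percent of the saline rate at the same pulse frequency) can be sketched as follows. This is an illustrative sketch only: the function name `normalize_scores` and the dictionary layout are assumptions introduced here, and the response counts are hypothetical, not data from this report.

```python
def normalize_scores(saline_rates, naloxone_rates, standard_hz=100):
    """Express self-stimulation rates the way Figure 24 describes.

    saline_rates, naloxone_rates: {pulse_frequency_hz: mean response rate}.
    Returns (saline_pct, naloxone_pct), where
      saline_pct[hz]   = saline rate at hz as % of the saline rate at standard_hz
      naloxone_pct[hz] = naloxone rate at hz as % of the saline rate at the same hz
    """
    reference = saline_rates[standard_hz]
    saline_pct = {hz: 100.0 * rate / reference for hz, rate in saline_rates.items()}
    naloxone_pct = {
        hz: 100.0 * naloxone_rates[hz] / saline_rates[hz] for hz in saline_rates
    }
    return saline_pct, naloxone_pct


# Hypothetical response counts; NOT data from the report.
saline = {25: 200, 100: 800, 400: 1000}
naloxone = {25: 50, 100: 400, 400: 800}

saline_pct, naloxone_pct = normalize_scores(saline, naloxone)
print(saline_pct)    # {25: 25.0, 100: 100.0, 400: 125.0}
print(naloxone_pct)  # {25: 25.0, 100: 50.0, 400: 80.0}
```

With this convention, a naloxone score that rises toward 100% at higher pulse frequencies corresponds to weaker naloxone suppression, which is the pattern described above.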

Conditioned Place-Preference: Evidence of Dopamine D2 Receptor
Involvement in Behavioral Reinforcement (#9)

The  neuronal  operant  conditioning  experiments  implicate  a  dopamine  D2 
receptor in reinforcement processes. This hypothesis was tested in a
behavioral experiment, in which the conditioned place-preference method was
used  to  measure  reinforcement.  Previous  work  has  established  that  injections 
of  reinforcing  drugs  in  one  compartment  of  a  2-compartment  apparatus  induce 
a  preference  for  the  compartment  in  which  the  reinforcing  injections  had  been 
made.  Dopamine  D1  and  D2  receptor  agonists  were  tested  for  their  reinforcing 
action  in  this  test.  N-0437  (3  mg/kg),  a  dopamine  D2  receptor  agonist, 
induced  a  significant  place  preference,  whereas  SKF  38393  (20  mg/kg),  a 
specific  D1  receptor  agonist,  induced  no  such  preference  (Fig.  25).  These 
results  are  consistent  with  those  of  the  neuronal  operant  conditioning 
experiments in suggesting that the D2, and not the D1, receptor is associated
with  reinforcement. 


Figure 25. Conditioned place preference induced by the dopamine
D2 receptor agonist N-0437 (3 mg/kg). The dopamine D1 receptor
agonist, SKF 38393 (20 mg/kg), had no significant effect. Bars
represent ± S.E.M.s.

Figure 26. Phosphoinositide (PI) turnover in hippocampal brain
slices induced by catecholamine receptor activation. Norepinephrine
(NE) produced a significant increase in PI turnover that was blocked
by the α-receptor antagonist prazosin (PRAZ), but not by the
β-receptor antagonist propranolol (PROP), confirming that PI
turnover is induced by α-noradrenergic receptor activation.
Dopamine and the dopamine D2 receptor agonist N-0437 had no
effect on PI turnover, and a mixture of dopamine and the dopamine
D1 receptor antagonist SCH 23390 (SCH) even seemed to suppress PI
turnover.


Biochemical  Experiments 

Involvement  of  the  dopamine  D2  receptor  both  in  neuronal  and  behavioral 
reinforcement  raises  the  question  of  which  second  messenger  may  mediate  its 
intracellular  effects.  The  dopamine  D2  receptor,  unlike  the  D1  subtype,  is  not 
linked  to  adenylate  cyclase;  this  excludes  cyclic  AMP  as  a  second  messenger. 
Although dopamine is not thought to be a potential first messenger in the
inositide pathway, the possibility that D2 receptor activation can stimulate this
pathway has not been experimentally excluded. Accordingly, we used the
method of Berridge et al. (Berridge, Downes & Hanley, 1982) to monitor
activation  of  the  inositol  pathway  in  vitro  by  exposure  of  tissue  to  various 
agonists. 

Results  from  experiments  using  hippocampus  are  summarized  in  Figure  26. 
Norepinephrine (0.1 mM NE) stimulates PI turnover, as has been reported
previously (Berridge et al., 1982). This effect is blocked by the
α-noradrenergic receptor antagonist, prazosin (NE + PRAZ), but not by the
β-noradrenergic receptor antagonist, propranolol (NE + PROP). No stimulation
of PI turnover was
observed after treatment with dopamine (1 mM DA), the dopamine D2 receptor
agonist,  N-0437,  or  a  combination  of  dopamine  and  the  dopamine  D1  receptor 
antagonist,  SCH  23390  (DA  +  SCH).  These  results  suggest  that  dopamine  D2 
receptor  activation  does  not  trigger  the  formation  of  phosphoinositide  second 
messengers. 


Conclusions 

Cellular  applications  of  dopamine  or  cocaine  to  spontaneously  active  CA1 
pyramidal  cells  in  slices  of  rat  hippocampus  had  opposite  effects  on  subsequent 
firing  rates,  depending  on  the  activity  pattern  of  the  neuron  at  the  time  of 
drug  administration.  If  the  neuron  had  been  firing  rapidly  just  before  the 
injections,  the  firing  rate  was  increased.  However,  if  the  neuron  had  been 
firing  slowly  or  was  silent  at  the  time  of  injection,  the  firing  rate  was 
unaffected  or  decreased.  In  other  words,  the  action  of  locally-applied 
dopamine  or  cocaine  on  hippocampal  cells  was  activity-related  in  a  way  that 
formally  resembles  the  action  of  conventional  reinforcers  on  behavior.  A  food 
pellet  delivered  immediately  after  a  lever-press  response  increases  lever 
pressing,  whereas  the  same  pellet  delivered  independently  of  the  lever-press 
response  has  no  effect  or  even  may  suppress  the  behavior.  These  observations, 
therefore,  are  consistent  with  the  possibility  that  the  activity  of  individual 
neurons  may  be  operantly  conditioned  by  direct  cellular  applications  of 
reinforcing  transmitters  or  drugs.  If  so,  and  since  it  is  unlikely  that  a  brain 
cell  would  display  a  gratuitous  capacity  for  operant  conditioning,  the  individual 
neuron  could  be  an  important  functional  unit  for  positive  reinforcement  in  the 
brain. 

These  conclusions  are  supported  by  preliminary  results  on  the  effects  of 
delay  of  reinforcement  in  the  neuronal  operant  conditioning  paradigm.  In 
behavioral  operant  conditioning,  it  is  well  established  that  the  effectiveness  of 
reinforcement  is  sharply  reduced  when  the  presentation  of  the  reinforcing 
stimulus  is  substantially  delayed;  indeed,  we  found  that  a  delay  as  short  as  one 
second caused a severe decrement in the acquisition of self-stimulation
behavior. A similar delay-of-reinforcement decrement was observed in the
neuronal operant conditioning experiments; in this case, however, delays as
short as 200 ms largely eliminated the facilitating action of N-0437 on
hippocampal CA1 bursting activity. The steep gradient of effectiveness of
delayed  reinforcement  makes  it  unlikely  that  nonspecific  stimulation  or  some 
artifact  of  the  injection  procedure  accounts  for  the  increase  in  neuronal  firing. 
Rather,  the  stringent  requirement  for  contingency  supports  the  idea  that  we 
have  identified  a  neuronal  conditioning  process  that  may  be  closely  related  to 
behavioral  operant  conditioning. 

We  have  begun  to  work  out  the  conditions  that  will  demonstrate  neuronal 
operant conditioning on a reliable basis. Thus, we find at present the most
satisfactory preparation for our operant conditioning experiments to be the
brain slice, the best neurons to be the large pyramidal cells in the CA1 field of
dorsal  hippocampus,  the  most  appropriate  neuronal  response  for  reinforcement 
to  be  a  burst  of  activity  containing  3  or  more  spikes,  and  the  most  reliable 
reinforcing  agents  to  be  dopamine,  cocaine,  and  a  newly  developed  and 
selective  dopamine  D2  receptor  agonist,  N-0437.  There  is  already  an  indication 
of  specificity  in  the  role  of  the  dopamine  receptor  in  cellular  reinforcement. 
Included  among  substances  that  are  ineffective  are  GABA,  serotonin, 
acetylcholine,  imipramine,  ethanol,  and  saline.  The  reinforcing  action  of 
dopamine  is  blocked  by  chlorpromazine  and  the  selective  dopamine  D2 
antagonist  sulpiride,  suggesting  that  dopamine’s  cellular  reinforcing  action  is 
mediated at D2, rather than D1, receptors. As noted above, this conclusion is
supported by positive experiments with the selective D2 receptor agonist,
N-0437, which may be more reliable than dopamine as a reinforcer in neuronal
operant conditioning. The D2 receptor reinforcement hypothesis is also
supported by a failure of the selective dopamine D1 antagonist, SCH 23390, to
block dopamine-reinforced operant conditioning. In fact, the combination of
dopamine and SCH 23390 provides slightly more reliable operant conditioning
than dopamine alone, suggesting that selective activation of dopamine D2
receptors may provide greater reinforcement than simultaneous activation of D1
and D2 receptors together. Preliminary results with electrical stimulation as
reinforcement in brain slice experiments also indirectly support the dopamine
reinforcement  hypothesis.  In  these  experiments,  mild  electric  stimulation  in 
the  vicinity  of  dopamine  axons  in  the  nucleus  accumbens  reinforced  the 
bursting  of  accumbens  cells.  Noncontingent  applications  of  the  same  electric 
stimulation  failed  to  increase  the  rate  of  bursting. 

Dopamine seems to be a more effective reinforcer than norepinephrine;
however, some recent experiments suggest that the efficacy of norepinephrine
is increased if it is combined with the β-noradrenergic receptor antagonist,
propranolol. This result suggests that selective activation of α-noradrenergic
receptors may provide greater reinforcement than simultaneous activation of α
and β receptors together, just as selective activation of dopamine D2 receptors
may be more favorable than the joint activation of D1 and D2 receptors.

A troublesome feature of the present experiments is the fact that
relatively high concentrations of the effective agents (1 mM of dopamine and
cocaine, and 10 mM of N-0437) were required for reinforcement. However, it
should be clear that total drug dose is determined not only by the
concentration of the solution injected, but also by other injection parameters,
such as duration and volume. Because drug injections in this experiment had
to be delivered to individual cells in close contingency to bursts of activity, it
was necessary to use exceedingly short injection durations (5-20 ms) and small
volumes (0.5-3 picoliters). After diffusion to action sites, these minute
droplets of drug presumably are diluted to concentrations comparable to those
produced in more conventional neuropharmacological studies, where lower
initial concentrations of drug are applied in greater volumes and for much
longer durations. In any case, until more is known about the local distribution
and metabolism of the reinforcing agents, our strategy has been to determine
effective concentrations empirically and to compare these relatively high
reinforcing concentrations with identical control injections applied
noncontingently  or  after  a  delay. 
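
The dose argument above can be made concrete with a small calculation. The sketch below is illustrative only: the amount delivered follows directly from concentration times volume, but the dilution estimate rests on assumptions introduced here and not made in the report, namely that the droplet spreads uniformly through a sphere of tissue, that the sphere radius is 50 µm (a hypothetical value), and that uptake and metabolism are ignored. The function names are likewise hypothetical.

```python
import math


def amount_delivered_fmol(conc_mM, volume_pL):
    """Drug delivered in one injection, in femtomoles.

    1 mM x 1 pL = 1e-3 mol/L x 1e-12 L = 1e-15 mol = 1 fmol,
    so the numerical product is already in femtomoles.
    """
    return conc_mM * volume_pL


def diluted_conc_uM(conc_mM, volume_pL, radius_um):
    """Concentration (µM) if the droplet spreads evenly through a sphere.

    ASSUMPTION: uniform spread in a sphere of radius `radius_um`,
    with no uptake or metabolism. This is a rough bound, not a diffusion model.
    """
    sphere_pL = (4.0 / 3.0) * math.pi * radius_um ** 3 * 1e-3  # 1 µm^3 = 1e-3 pL
    return conc_mM * 1000.0 * volume_pL / sphere_pL


# Largest dopamine injection mentioned in the text: 1 mM, 3 pL.
print(amount_delivered_fmol(1.0, 3.0))                   # 3.0 fmol
# Hypothetical 50-µm spread radius (assumed for illustration).
print(round(diluted_conc_uM(1.0, 3.0, 50.0), 2))         # about 5.73 µM
```

Under these assumed conditions a 1-mM droplet would be diluted by roughly two orders of magnitude, which is the kind of reduction the paragraph above presumes; the actual figure depends on the true spread of the drug, which is not known.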

Finally,  we  have  begun  to  consider  the  biochemical  events  that  may 
mediate  the  cellular  reinforcement  process.  What  is  required  is  a  mechanism 
that  will  satisfy  the  following  conditions:  1)  if  a  brain  cell  with  the  capacity 
for  positive  reinforcement  discharges  in  a  burst  of  activity,  and  2)  if  that 
cell’s  catecholamine  or  endorphin  "reinforcement"  receptors  are  activated 
shortly  thereafter,  then  and  only  then,  3)  will  membrane  proteins,  which 
control  the  cell’s  excitability,  be  modified  to  increase  the  probability  of  future 
firing.  Clearly,  only  recently  active  cells  can  be  eligible  for  reinforcement. 
Three  possible  ionic  markers  of  recent  activity,  and  hence  reinforcement 
eligibility,  are  Na+  or  Ca++  influx,  or  K+  efflux.  Since  calcium  influx  is  a 
universal  signal  for  the  activation  of  intracellular  biochemistry,  we  assume  that 
calcium  influx  may  be  the  ionic  signal  that  primes  the  cell  for  the 
reinforcement  message. 
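
The three conditions listed above amount to a simple eligibility rule: a burst is reinforced only if receptor activation follows it within a short window. The sketch below illustrates that logic only; it is not a model of the biochemistry. The function name `reinforced_bursts` is hypothetical, and the 200-ms default window is an assumption borrowed from the delay-of-reinforcement results reported earlier, not a measured duration of calcium priming.

```python
def reinforced_bursts(burst_end_times_ms, reinforcer_times_ms, window_ms=200.0):
    """Return the bursts that satisfy the eligibility rule.

    A burst counts as reinforced only if some reinforcer (receptor activation)
    arrives at or after the end of the burst and no later than `window_ms`
    afterwards. Reinforcers that precede the burst, or arrive too late, do not count.
    """
    eligible = []
    for burst_end in burst_end_times_ms:
        if any(0.0 <= t - burst_end <= window_ms for t in reinforcer_times_ms):
            eligible.append(burst_end)
    return eligible


# Hypothetical times (ms): only the first burst is followed promptly by a reinforcer.
bursts = [100.0, 1000.0, 2000.0]
reinforcers = [150.0, 1500.0, 1900.0]
print(reinforced_bursts(bursts, reinforcers))  # [100.0]
```

In this example the second reinforcer arrives 500 ms after its burst and the third arrives before its burst, so neither would modify the cell under the stated rule. This mirrors the observation that delayed or noncontingent applications are ineffective.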

The  next  step  is  to  identify  the  intracellular  event  or  second  messenger 
that  may  be  activated  by  the  reinforcing  signal.  Such  second  messengers  could 
include  cyclic  AMP,  or  the  phosphoinositide  second  messengers,  diacylglycerol 
and  inositol  triphosphate,  or  other  substances,  including  some  that  are  still 
unknown.  Following  the  work  of  Kandel  (1984),  we  (Stein  and  Belluzzi,  1986) 
speculated  initially  that  the  second  messenger  associated  with  the 
reinforcement  signal  might  be  cyclic  AMP,  in  part  because  the  existence  of  a 
dopamine-activated  adenylate  cyclase  is  well  established  (Greengard,  1978). 
However,  such  dopamine  activation  of  adenylate  cyclase  is  known  to  be 
mediated  via  dopamine  D1  receptors,  while  our  work  suggests  that  a  D2 
receptor  is  more  likely  to  be  involved  in  cellular  reinforcement  (Belluzzi  and 
Stein,  1986).  Furthermore,  it  is  well  established  that  enkephalins  and 
rewarding  opiate  drugs  inhibit,  rather  than  activate,  adenylate  cyclase,  and  it 
has  been  speculated  that  such  inhibition  is  involved  in  their  reinforcing  action. 
It  seems  probable,  therefore,  that  second  messengers  other  than  cAMP  are 
involved  in  the  biochemical  mechanism  of  positive  reinforcement. 
Unfortunately,  the  second  messengers  associated  with  dopamine  D2  receptor 
activation  are  presently  unknown;  our  own  work,  described  above,  demonstrates 
that  phosphoinositide  second  messengers  are  not  involved.  Nevertheless, 
diacylglycerol  and  inositol  triphosphate  continue  to  intrigue  us,  in  part  because 
their  a- noradrenergic  first  messenger  receptor  has  been  implicated  in 
behavioral  reinforcement  and  long-term  memory  (Stein,  1978),  and  in  part 
because  their  associated  third  messenger,  protein  kinase  C,  is  a  calcium- 
dependent  brain  kinase,  which  can  be  activated  to  modify  membrane  proteins 
that control cellular excitability (Nishizuka, 1986). It is also known that
calcium influx shifts protein kinase C from the cytosol to its membrane-bound
form (Schulman, 1984), thereby priming and positioning the enzyme for
activation by an extracellular (reinforcing?) signal. Finally, inositol
triphosphate, by mobilizing intracellular calcium, could activate the gene
transcription and protein synthesis that may be necessary for the long-term
behavioral changes induced by positive reinforcement.

In brief, the proposed mechanism of positive reinforcement is envisioned
to operate in the following manner. In certain cells capable of operant
conditioning, a burst of firing leading to strong Ca++ influx induces a state of
reinforcement eligibility by briefly shifting protein kinase C to its
membrane-bound form. In this window of opportunity, protein kinase C is
oriented in close conjunction to first-messenger reinforcement receptors and
their second-messenger enzyme, phospholipase C. At this point, activation of the
first-messenger reinforcement receptors by norepinephrine or other appropriate
transmitters stimulates the formation of diacylglycerol and inositol
triphosphate. These second messengers, in turn, activate the membrane-bound
protein kinase C for short-term modification of membrane proteins that control
cellular excitability. Long-term changes in excitability may be initiated by the
same intracellular messengers, which could switch on genomic events leading to
long-term behavioral changes.